Curation of Dutch Regional Dictionaries
Abstract
This paper describes the process of semi-automatically converting dictionaries from paper to structured text (database) and the integration of these into the CLARIN infrastructure in order to make the dictionaries accessible and retrievable for the research community. The case study at hand is that of the curation of 42 fascicles of the Dictionaries of the Brabantic and Limburgian dialects, and 6 fascicles of the Dictionary of dialects in Gelderland.
Related resources
The Dutch LESLLA Corpus
This paper describes the Dutch LESLLA data and its curation. LESLLA stands for Low-Educated Second Language and Literacy Acquisition. The data was collected for research in this field and would have been lost had it not been preserved. Within the CLARIN project Data Curation Service, the data was turned into a spoken language resource and made available to other researchers.
Study of the foundation, models and issues of research data curation and management in scientific and academic environments
Background and Aim: The purpose of this paper is to study, identify, and discuss the foundations and concepts, models and frameworks, and dimensions and challenges of research data curation and management in scientific and academic environments. Method: This is a review article; a library-based method was used to collect scientific and research texts in this field. In this research, external an...
Epitopes in ChEBI – A Collaboration with the IEDB
ChEBI evolution: Since its inception in 2004, ChEBI has evolved from an illustrated dictionary of terms into a semantically rich knowledge base with an internal hierarchy that organises entities by their molecular structure types and potential rôles. Its 2009 acquisition of the BioFocus drug discovery dataset [2] exponentially increased the number of entities from 20,000 to 500,000. ChEBI conti...
A Dictionary-based Approach to Racism Detection in Dutch Social Media
We present a dictionary-based approach to racism detection in Dutch social media comments, which were retrieved from two public Belgian social media sites likely to attract racist reactions. These comments were labeled as racist or non-racist by multiple annotators. For our approach, three discourse dictionaries were created: first, we created a dictionary by retrieving possibly racist and more...
The Integrated Language Database of 8th - 21st-Century Dutch
The Institute for Dutch Lexicology (INL) has a long-standing tradition in corpus-based lexicography. The results include electronic scholarly dictionaries of Dutch covering the vocabulary from 1200 up to 1976, linguistically annotated electronic text corpora of historical and present-day Dutch, and computational lexica. Added value to these data is given in an on-going long-term INL project, th...